5 research outputs found

    Anti-phishing System using LSTM and CNN

    TamilSpell: A Tamil spelling error corpus

    TamilSpell is a comprehensive four-column corpus whose columns are the sentence or word containing errors, the corrected version of the sentence, the error code or tag, and the edit distance. The corpus comprises a total of 10,157,662 entries: 8,858,630 isolated errors in three categories (non-word errors, real-word errors and sandhi errors), 803,531 errors in 2-way combinations, and 495,501 errors in 3-way combinations. The corpus files are:
    1. Isolated errors (ISOLATED ERRORS.txt)
    2. 2-way combinations (RWE+NWE.txt, RWE+SE.txt and SE+NWE.txt)
    3. 3-way combinations (SE+RWE+NWE.txt)
    These data are made available under a Creative Commons Attribution-NonCommercial 4.0 International licence. https://www.creativecommons.org/licenses/by-nc/4.0/deed.e
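The four-column layout described above could be read along the following lines. This is only a sketch under stated assumptions: the description does not give the column delimiter, so a tab is assumed; the column order is taken from the order in which the description lists the columns; and the edit distance is assumed to be a character-level Levenshtein distance over Unicode code points, which the description does not confirm. The names `Entry`, `read_entries` and `levenshtein` are hypothetical helpers, not part of the corpus.

```python
import csv
from typing import Iterable, Iterator, NamedTuple


class Entry(NamedTuple):
    erroneous: str      # sentence or word containing the error
    corrected: str      # corrected version
    error_tag: str      # error code or tag
    edit_distance: int  # edit distance as recorded in the corpus


def read_entries(lines: Iterable[str], delimiter: str = "\t") -> Iterator[Entry]:
    """Yield one Entry per well-formed four-column row.

    ASSUMPTION: tab-separated columns in the order listed in the
    corpus description. Rows that do not fit are skipped.
    """
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 4:
            continue
        erroneous, corrected, tag, dist = row
        try:
            yield Entry(erroneous, corrected, tag, int(dist))
        except ValueError:
            continue  # e.g. a header row or a non-numeric distance


def levenshtein(a: str, b: str) -> int:
    """Character-level Levenshtein distance over Unicode code points.

    ASSUMPTION: the corpus may measure edit distance differently
    (for example over Tamil grapheme clusters).
    """
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Made-up placeholder row, NOT taken from the corpus.
sample = ["kitten\tsitting\tNWE\t3\n"]
entries = list(read_entries(sample))
matches = [levenshtein(e.erroneous, e.corrected) == e.edit_distance
           for e in entries]

# The three reported subtotals add up to the reported total.
total = 8_858_630 + 803_531 + 495_501
```

With a real file, `read_entries(open(path, encoding="utf-8"))` would stream the rows; comparing `levenshtein` against the stored distance is one way to test whether the code-point assumption holds for the corpus.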